security: tighten on-disk secret handling#5
Conversation
There was a problem hiding this comment.
Pull request overview
Security hardening for local persistence by tightening file permissions on cached data, validating session pickle safety before deserialization, and preventing raw exception text from leaking into the login UI.
Changes:
- Lock down SQLite cache DB permissions after creation to reduce unintended readability on POSIX systems.
- Refuse to load (and delete)
session.pickleif ownership/permissions indicate it may be attacker-writable (POSIX). - Replace UI display of raw exception strings during login with a generic message and log the traceback instead.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/data/cache.py |
Applies restrictive permissions to the cache DB file to limit on-disk exposure. |
src/auth/session_manager.py |
Adds a safety gate before unpickling a persisted session and deletes unsafe session files. |
src/auth/login_view.py |
Stops echoing exception messages to users; logs traceback and shows a generic sign-in failure message. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- cache.py: pre-create the SQLite DB file via os.open with mode 0o600 before sqlite3.connect, so there's no TOCTOU window between connect() and chmod() where the file sits with umask-default perms. - session_manager.py: route every chmod(SESSION_FILE) through a small helper that tolerates OSError, and rewrite the safety-gate docstring to honestly describe what it does (cross-user defense) and doesn't (same-user malicious process) mitigate. - tests: add POSIX-only regression tests: cache DB must be 0o600 even under a zero umask, and try_restore_session() must refuse (and delete) session.pickle when it's group/world-writable OR owned by a different uid, without ever calling load_session() on the blob.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address the second review pass on PR #5: - Require the session pickle path to be a regular, non-symlink file via lstat + S_ISREG/S_ISLNK. A directory or FIFO owned by the user would previously have passed the uid+mode check and then failed loudly in load_session (or in the unlink cleanup path, since Path.unlink() can't remove directories). Now we fail-closed before touching the file, and the Windows branch still rejects non-regular paths even though it skips mode bits. - Tests: add POSIX-only regression for symlinks, and a cross-platform regression for a directory planted at the session path (asserts we never call load_session and the directory survives the cleanup attempt without raising). The PR body has also been updated to describe the actual threat model (cross-user tampering only) so the docs and the docstring agree.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address PR #5 third review pass: the os.open pre-create block was catching every OSError, which would silently swallow EACCES / ENOSPC and let sqlite3.connect then create the DB with umask-default perms, defeating the hardening. Narrow the except to FileExistsError (the race between exists() and O_EXCL) and let anything else propagate.
- cache.py: chmod the SQLite cache DB to 0o600 after creation so cached account and transaction data isn't world/group readable on shared POSIX systems (the directory was already 0o700, but the file wasn't locked down like session.pickle / preferences.json). - session_manager.py: refuse to load session.pickle unless it is owned by the current user and not group/world-writable. pickle deserialization of an attacker-writable file is RCE; this closes the gap where another process running as the same user could plant a malicious pickle. - login_view.py: stop echoing raw exception messages into the login UI on the generic except branch. Log the traceback instead and show a generic "Sign-in failed" message so upstream error text can't leak into the UI.
- cache.py: pre-create the SQLite DB file via os.open with mode 0o600 before sqlite3.connect, so there's no TOCTOU window between connect() and chmod() where the file sits with umask-default perms. - session_manager.py: route every chmod(SESSION_FILE) through a small helper that tolerates OSError, and rewrite the safety-gate docstring to honestly describe what it does (cross-user defense) and doesn't (same-user malicious process) mitigate. - tests: add POSIX-only regression tests: cache DB must be 0o600 even under a zero umask, and try_restore_session() must refuse (and delete) session.pickle when it's group/world-writable OR owned by a different uid, without ever calling load_session() on the blob.
Address the second review pass on PR #5: - Require the session pickle path to be a regular, non-symlink file via lstat + S_ISREG/S_ISLNK. A directory or FIFO owned by the user would previously have passed the uid+mode check and then failed loudly in load_session (or in the unlink cleanup path, since Path.unlink() can't remove directories). Now we fail-closed before touching the file, and the Windows branch still rejects non-regular paths even though it skips mode bits. - Tests: add POSIX-only regression for symlinks, and a cross-platform regression for a directory planted at the session path (asserts we never call load_session and the directory survives the cleanup attempt without raising). The PR body has also been updated to describe the actual threat model (cross-user tampering only) so the docs and the docstring agree.
Address PR #5 third review pass: the os.open pre-create block was catching every OSError, which would silently swallow EACCES / ENOSPC and let sqlite3.connect then create the DB with umask-default perms, defeating the hardening. Narrow the except to FileExistsError (the race between exists() and O_EXCL) and let anything else propagate.
c137314 to
d9467ba
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
PR #5 review (Copilot) flagged two remaining gaps in the on-disk hardening: - session_manager.try_restore_session() previously short-circuited on Path.exists(), which returns False for dangling symlinks. A planted symlink at session.pickle would therefore skip the safety gate and stay on disk for a subsequent save_session() to follow. Remove the early exists() check; the safety gate already handles missing files (lstat → OSError → False). Logout's exists()/unlink has the same failure mode — switch to unconditional unlink + FileNotFoundError. - cache.DataCache.__init__ validated db_path with exists(), which follows symlinks. sqlite3.connect() + chmod would then operate on the symlink target. Use lstat() to reject symlinks and non-regular files up front, add O_NOFOLLOW to the pre-create os.open. Added regression tests for both cases.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Second PR #5 review pass raised three more gaps: - cache.py had a TOCTOU: after FileExistsError from the pre-create os.open, we proceeded to sqlite3.connect without re-validating the path. A process racing us could swap the file for a symlink in that window. Restructured so there's always an authoritative lstat() check immediately before sqlite3.connect. - session_manager.save_session was unprotected against a planted symlink/non-regular entry at SESSION_FILE. Added _prepare_session_file_for_write() that lstats and unlinks anything non-regular before MonarchMoney writes the pickle. Called from both login() and login_with_mfa(). - logout() previously only caught FileNotFoundError, so a directory planted at SESSION_FILE would crash logout with IsADirectoryError. Broadened to OSError. Regression tests added for all three.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Third PR #5 review pass: the regular-file check is necessary but not sufficient. If another uid has write access to the cache dir, they can pre-create a regular file owned by them. save_session would then write our pickled session into their file and the follow-up chmod(0o600) would silently fail (not owner); the same for sqlite3.connect on a pre-planted cache DB. - _prepare_session_file_for_write() now requires st_uid == getuid() and no group/world-writable bits on POSIX before trusting an existing regular file. Otherwise unlink + let save_session create fresh. - DataCache.__init__ raises OSError on POSIX if a regular file at db_path is owned by a different uid. Windows accepts any regular file on both paths (no POSIX uid). Regression tests cover foreign-owned session file, world-writable session file, and foreign-owned cache DB.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fourth PR #5 review pass: - _prepare_session_file_for_write tightened: any existing regular file must also have no group/other bits set (mask 0o077, not just the 0o022 writable bits). A legacy 0o644 pickle left behind under a loose umask would otherwise leak the next save_session write to world-readers until _chmod_session_file() tightens it. - cache.py chmod moved to *before* sqlite3.connect so a pre-existing DB with loose perms (or a connect() that errors before the post- chmod) can't be read/written with world-readable mode. For the just-pre-created case (os.open already set 0o600) the extra chmod is a no-op.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fifth PR #5 review pass — both changes narrow: - _prepare_session_file_for_write docstring drifted when the mask tightened from 0o022 to 0o077 last commit. Updated to match: "no group or world permission bits set at all", with the why-of-the read-window spelled out. - cache.py: only run the O_CREAT|O_EXCL pre-create when O_NOFOLLOW is available. Without it, O_CREAT could in principle follow a planted symlink before the lstat gate catches it. In practice this is Windows-only (O_NOFOLLOW is a POSIX feature), where creating symlinks requires admin and POSIX mode bits don't map to NTFS ACLs, so skipping the pre-create is both safer and cheaper. The post-check now tolerates FileNotFoundError so sqlite3.connect can create the file itself on that path.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
O_EXCL and O_NOFOLLOW interact differently across POSIX kernels: macOS raises EEXIST first (caught as FileExistsError, then the lstat gate raises \"non-regular\"), Linux may raise ELOOP directly from os.open. Both are correct rejections, so the test shouldn't depend on which specific errno wins. Assert OSError without message matching.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Avoid implying non-admin Windows users can't create symlinks — Windows 10+ Developer Mode enables it. The actual reason we skip pre-create on those platforms is the missing O_NOFOLLOW primitive and the different permission model (NTFS ACLs), not symlink availability.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
On platforms without O_NOFOLLOW (effectively Windows) we skip the pre-create, so if the DB file doesn't exist yet the chmod-before- connect is a no-op (FileNotFoundError, swallowed) and sqlite3 creates the file with umask-default perms. A second chmod after connect catches that first-run case on those platforms; it's a no-op when the pre-connect chmod already tightened a pre-existing file.
account and transaction data isn't world/group readable on shared
POSIX systems (the directory was already 0o700, but the file wasn't
locked down like session.pickle / preferences.json). The DB file is
now also pre-created with
os.open(..., O_CREAT | O_EXCL, 0o600)before
sqlite3.connect, so it never exists with umask-default perms.regular, non-symlink file owned by the current uid and not group- or
world-writable. Threat model: this defends against cross-user
tampering only (another uid writing into a loosely-permissioned
home, or a symlink/directory planted at that path). It does NOT
defend against a malicious process running as the same user — that
attacker can write a 0o600 file owned by the user that passes every
check. Closing that gap would require removing pickle from the
persistence format or HMAC'ing the blob with a keyring-held key;
neither is in scope for this PR.
on the generic except branch. Log the traceback instead and show a
generic "Sign-in failed" message so upstream error text can't leak
into the UI.